This project investigates how, and to what extent, weather and time of day influence bike rentals in a public bike sharing system in Montreal. Public data from the Canadian government’s past weather and climate service and bike sharing data available from Kaggle are analyzed with a simple baseline model (moving average) and a more complex machine learning model (gradient boosting regression). Partial dependence plots (PDPs) and individual conditional expectation (ICE) plots are used to visualize the influence of the individual factors.
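To make the relationship between ICE plots and PDPs concrete, here is a minimal sketch in plain Python. It is not the code used in the analysis: `toy_model`, the observations, and the temperature grid are hypothetical stand-ins for a fitted regressor and real data. Each ICE curve varies temperature for one observation while holding its humidity fixed, and the PDP is the pointwise average of those curves.

```python
from statistics import mean


def toy_model(temperature, humidity):
    """Hypothetical stand-in for a fitted regressor (not the author's model).

    Predicted rides rise with temperature, but the slope is much smaller
    when relative humidity is high.
    """
    slope = 10 if humidity < 70 else 2
    return 50 + slope * temperature


# Hypothetical observations: (temperature in deg C, relative humidity in %).
observations = [(5, 40), (12, 55), (20, 80), (25, 90)]
temperature_grid = [0, 10, 20, 30]


def ice_curves(model, rows, grid):
    """One curve per row: sweep temperature over the grid, keep humidity fixed."""
    return [[model(t, humidity) for t in grid] for _, humidity in rows]


def partial_dependence(curves):
    """The PDP is the pointwise average of the ICE curves."""
    return [mean(values) for values in zip(*curves)]


ice = ice_curves(toy_model, observations, temperature_grid)
pdp = partial_dependence(ice)

for (_, humidity), curve in zip(observations, ice):
    print(f"ICE, humidity {humidity}%: {curve}")
print(f"PDP (average):      {pdp}")
```

In this toy setup the averaged PDP rises steadily, while the individual ICE curves split into a steep group (low humidity) and a flat group (high humidity). That spread between curves is what signals an interaction that the average alone would hide.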
Results show that the model explains the number of hourly bike rides very well (\(93.3\%\) of variance explained). The most important influences on the number of bike rides appear to be temperature, atmospheric pressure, hour of the day, and relative humidity, but these factors interact strongly: for example, the predicted number of bike rides increases with temperature, but only if relative humidity is not too high.
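As a rough illustration of how a "variance explained" figure can be computed, the sketch below scores a trailing moving-average forecast with the coefficient of determination \(R^2 = 1 - SS_{res}/SS_{tot}\). Two assumptions are made here: that "variance explained" refers to \(R^2\), and that the baseline predicts each hour from the mean of the preceding hours. The ride counts and the window size are hypothetical, not taken from the Montreal data.

```python
def moving_average_forecast(series, window):
    """Predict each value as the mean of the preceding `window` values."""
    return [
        sum(series[i - window:i]) / window
        for i in range(window, len(series))
    ]


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    mean_actual = sum(actual) / len(actual)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical hourly ride counts (illustrative only).
hourly_rides = [10, 12, 14, 16, 18, 20, 22, 24]
window = 2

predicted = moving_average_forecast(hourly_rides, window)
actual = hourly_rides[window:]  # the first `window` hours have no forecast
score = r_squared(actual, predicted)

print(f"Share of variance explained by the baseline: {score:.1%}")
```

On a steadily rising series like this one, the trailing average always lags behind the true value, so the baseline explains only a small share of the variance. The same scoring function can be applied to any other model's predictions for a like-for-like comparison.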
The full analysis is available in multiple Python files on GitHub: kgl-cycle-share-main-file.py.
A synopsis is available as an IPython notebook, cycle-share-analysis-synopsis.ipynb, or as HTML to download.